Identifying Effective Translations for Cross-lingual Arabic-to-English User-generated Speech Search

Authors

  • Ahmad Khwileh
  • Haithem Afli
  • Gareth J. F. Jones
  • Andy Way
Abstract

Cross-Language Information Retrieval (CLIR) systems are a valuable tool to enable speakers of one language to search for content of interest expressed in a different language. A group for whom this is of particular interest is bilingual Arabic speakers who wish to search for English language content using information needs expressed in Arabic queries. A key challenge in CLIR is crossing the language barrier between the query and the documents. The most common approach to bridging this gap is automated query translation, which can be unreliable for vague or short queries. In this work, we examine the potential for improving CLIR effectiveness by predicting the translation effectiveness using Query Performance Prediction (QPP) techniques. We propose a novel QPP method to estimate the quality of translation for an Arabic-English Cross-lingual User-generated Speech Search (CLUGS) task. We present an empirical evaluation that demonstrates the quality of our method on alternative translation outputs extracted from an Arabic-to-English Machine Translation system developed for this task. Finally, we show how this framework can be integrated in CLUGS to find relevant translations for improved retrieval performance.
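
The general idea of ranking alternative translations with a query performance predictor can be illustrated with a minimal sketch. This is not the QPP method proposed in the paper, which the abstract does not specify; it swaps in average inverse document frequency (AvgIDF), a standard pre-retrieval predictor. The function names `avg_idf` and `pick_translation`, the whitespace tokenisation, and the toy documents and candidate translations are all assumptions made for illustration.

```python
import math
from collections import Counter


def avg_idf(query, doc_freq, num_docs):
    """Average IDF of the query terms, used here as a stand-in predictor.

    Terms that never occur in the collection contribute 0, since they
    cannot match any document (an assumption of this sketch).
    """
    terms = query.lower().split()
    if not terms:
        return 0.0
    total = 0.0
    for term in terms:
        df = doc_freq.get(term, 0)
        if df > 0:
            total += math.log(num_docs / df)
    return total / len(terms)


def pick_translation(candidates, docs):
    """Return the candidate translation with the highest predictor score."""
    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc.lower().split()))
    return max(candidates, key=lambda q: avg_idf(q, doc_freq, len(docs)))


# Hypothetical English collection and hypothetical alternative
# translations of a single Arabic query.
docs = [
    "lecture on machine translation",
    "cooking video recipe",
    "machine learning lecture video",
]
candidates = [
    "machine translation lecture",
    "machine video",
    "automatic rendering talk",
]

best = pick_translation(candidates, docs)
print(best)
```

In this toy setting the predictor favours the candidate whose terms are both present in the collection and discriminative. A real system would score translations against the actual speech-transcript index and would likely combine several predictors.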

Similar resources

LiveTrans-Cross-Language Web Search through Live Mining of Query Translations

Enabling users to find effective translations automatically for query terms not included in dictionary is one of the major goals of a practical cross-language Web search service. This paper presents a cross-language Web search system called LiveTrans, which is an experimental metasearch engine that provides English-Chinese cross-lingual retrieval of both Web pages and images. The system has bee...

Cross-lingual sentence extraction for information distillation

Information distillation aims to analyze and interpret large volumes of speech and text archives in multiple languages and produce structured information of interest to the user. In this work, we investigate cross-lingual information distillation, where non-English (source language) documents are searched for user queries that are in English (target language). We propose to perform distillation ...

A Scalable Video Search Engine Based on Audio Content Indexing and Topic Segmentation

One important class of online videos is that of news broadcasts. Most news organisations provide near-immediate access to topical news broadcasts over the Internet, through RSS streams or podcasts. Until lately, technology has not made it possible for a user to automatically go to the smaller parts, within a longer broadcast, that might interest them. Recent advances in both speech recognition ...

A Comparative Analysis of Collocation in Arabic-English Translations of the Glorious Quran

The Qur’an is the only holy book of Muslims all around the world. People of every religion and language are interested in comprehending and accepting the rules and regulations of their own belief. Translation of the Qur’an is only an attempt to present its meaning. One of the main challenges in translation of the Qur’an is collocation. A collocation is a sequence of words or terms that co-o...

Identifying Agreement/Disagreement in Conversational Speech: A Cross-Lingual Study

This paper presents models for detecting agreement/disagreement between speakers in English and Arabic broadcast conversation shows. We explore a variety of features, including lexical, structural, durational, and prosodic features. We experiment with these features using Conditional Random Fields models and conduct systematic investigations on efficacy of various feature groups across language...

Publication date: 2017